Kalshi Analysis 2025 (Updated May 31st)

Author

Benjamin Sherman

Introduction

In 2025, I began trading on Kalshi, a prediction market platform where users can trade on the outcomes of real-world events, ranging from politics to sports. From April onwards in particular, I began putting more thought into my trading strategies, so I thought it apt to journal the process (lessons, profitability, trends, and anything else of note). This is meant to be a journal for fun with an emphasis on the results and narratives, not a showcase of coding or anything scientific/professional (no stochastic calculus to be found here), so all of the code is hidden. If you’re interested in the code, just click “Show code” above any result.

Basic Statistics

View below code to see library loading

Show code
#install.packages("kableExtra")
suppressPackageStartupMessages(library(tidymodels))
tidymodels_prefer()
suppressPackageStartupMessages(library(tidyverse))
library(kableExtra)
library(glue)
library(rUM)
library(rio)
library(table1)
library(knitr)
library(gt)
library(broom)
library(conflicted)

View below code to see dataset loading/cleaning

Show code
Kalshi25 <- read.csv("~/Downloads/MayKalshi.csv", stringsAsFactors = FALSE)

library(scales)  # dollar() and dollar_format() are used for currency labels below

Kalshi25$Created_Clean <- gsub(" at ", " ", Kalshi25$Created)
Kalshi25$Created_Clean <- gsub(" EST", "", Kalshi25$Created_Clean)

Kalshi25$Created_Parsed <- parse_date_time(Kalshi25$Created_Clean, orders = "b d, Y I:Mp")

Kalshi25$Month <- format(Kalshi25$Created_Parsed, "%B")
Kalshi25$DayOfWeek <- weekdays(Kalshi25$Created_Parsed) 
Kalshi25$Hour <- hour(Kalshi25$Created_Parsed)              

month_levels <- month.name  # "January" to "December"
day_levels <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")

Kalshi25 <- Kalshi25 |> select(-Created)

glimpse(Kalshi25)
Rows: 10,883
Columns: 14
$ Type             <chr> "settlement", "credit", "trade", "settlement", "trade…
$ Ticker           <chr> "GPT5-24DEC31", "", "HOUSEMOV-24-R-B1-5", "PRES-2024-
$ Contracts        <int> 500, 0, 486, 100, 104, 562, 32, 29, 50, 2, 4, 24, 230
$ Direction        <chr> "Yes", "Yes", "No", "Yes", "Yes", "Yes", "No", "Yes",…
$ Average_Price    <int> 7, 0, 2, 1, 46, 6, 62, 66, 4, 50, 51, 51, 4, 47, 1, 1
$ Realized_Revenue <chr> "0.00", "$1.09", "0.00", "0.00", "0.00", "0.00", "0.0…
$ Realized_Cost    <chr> "$35.00", "0.00", "0.00", "$1.00", "0.00", "0.00", "0
$ Realized_Profit  <chr> "-$35.00", "+$1.09", "$0.00", "-$1.00", "$0.00", "$0.…
$ Fees             <chr> "0.00", "0.00", "0.00", "0.00", "$1.81", "$2.22", "$0
$ Created_Clean    <chr> "Jan 1, 2025 2:28AM", "Jan 1, 2025 3:08AM", "Jan 6, 2…
$ Created_Parsed   <dttm> 2025-01-01 02:28:00, 2025-01-01 03:08:00, 2025-01-06…
$ Month            <chr> "January", "January", "January", "January", "January"…
$ DayOfWeek        <chr> "Wednesday", "Wednesday", "Monday", "Monday", "Monday…
$ Hour             <int> 2, 3, 14, 13, 2, 2, 16, 23, 23, 23, 23, 23, 2, 2, 10,…

For fun, let’s start simple by looking at some straight numbers thus far. Nothing fancy.

Firstly, my profit in 2025:

Show code
Kalshi25$Realized_Profit_Clean <- as.numeric(gsub("[\\$,]", "", Kalshi25$Realized_Profit))
total_realized_profit <- sum(Kalshi25$Realized_Profit_Clean, na.rm = TRUE)


data.frame(Total_Realized_Profit = dollar(total_realized_profit)) |>
  kable(caption = "", align = "c") |>
  kable_styling(full_width = FALSE)
Total_Realized_Profit
$4,819.00

Welp, the number isn’t negative. That’s always a good start. Given that I started with $200, I’m proud of this.

How about the amount of money I’ve lost in 2025 to fees?

Show code
Kalshi25$Fees_Clean <- as.numeric(gsub("[\\$,]", "", Kalshi25$Fees))
total_fees <- sum(Kalshi25$Fees_Clean, na.rm = TRUE)

data.frame(Total_Fees = dollar(total_fees)) %>%
  kable(caption = "", align = "c") %>%
  kable_styling(full_width = FALSE)
Total_Fees
$1,612.01

Holy shit. This is surprising. The profit figure above already takes fees into account, but still. It’s clear that these minimal fees have built up for me over time, but it also shows how Kalshi remains profitable.
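To put the fee total in proportion, here is a small sketch using only the two figures reported above. It assumes, as stated, that the profit figure is already net of fees; the variable names are just for illustration.

```r
# Figures reported above; net profit already has fees deducted
net_profit <- 4819.00
fees_paid  <- 1612.01

# Profit before fees, and the share of it that went to fees
gross_profit <- net_profit + fees_paid
fee_share    <- fees_paid / gross_profit

round(gross_profit, 2)     # 6431.01
round(100 * fee_share, 1)  # 25.1
```

Under that assumption, roughly a quarter of my pre-fee profit went to Kalshi.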

As a conservative estimate, let’s assume there are 5,000 traders who trade in higher volume than me (the real number is probably higher). These 5,000 traders have likely paid more in fees than I have. Again, let’s be conservative and assume the average person in this group has paid about $3,000 in fees (I know a few people with higher volume than me who have paid five-figure amounts, so I’m confident that this is conservative). That would suggest $15,000,000 in revenue for Kalshi so far in 2025, and that’s just from the population of traders who have greater order flow than I do. Damn.

Of course, this isn’t very scientific; 5,000 was a guesstimate and I’m also assuming everyone else is paying fees at the same rate. The funny thing is that I actually believe I’m paying fees at a smaller rate than others. My market making/arbitrage strategies are almost all based on resting orders as opposed to immediately clearing orders, and resting orders incur fewer fees than clearing orders. AND we’re not accounting for the LARGE majority of traders that are trading in smaller volume than me. That would suggest that Kalshi’s total revenue in 2025 is a much, much higher number, but I digress.
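The back-of-the-envelope estimate above is a single multiplication. Both inputs are my guesses from the text, not measured values, and the variable names are made up for this sketch.

```r
# Guesses from the text, not measured values
n_heavier_traders <- 5000  # traders assumed to trade more than I do
avg_fees_each     <- 3000  # assumed average fees paid per such trader, in USD

est_fee_revenue <- n_heavier_traders * avg_fees_each

format(est_fee_revenue, big.mark = ",", scientific = FALSE)  # "15,000,000"
```

Changing either guess scales the estimate linearly, so doubling the trader count or the average fee doubles the result.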

Another quick side point: this helps explain why the Kalshi team is so responsive. I’ve submitted a few tickets to customer support, and they have always surprised me with how quickly they respond. If 5,000 people are generating eight figures in fees for them, then it makes sense that they would coddle, encourage, and care for those people as much as possible. Besides, the people who care enough to submit tickets and complain are usually high-volume traders.

Now, let’s look at some basic statistics. How about some frequencies? More specifically, the frequency with which I’m trading. Each number below is a count of rows in my Kalshi export, which I’ll loosely call trades (the export also logs settlements and credits as separate rows).

Show code
Kalshi25$Month <- factor(Kalshi25$Month, 
                         levels = month.name, ordered = TRUE)

month_counts <- table(Kalshi25$Month)


month_counts |>
  as.data.frame() |>
  setNames(c("Month", "Count")) |>
  gt() |>
  tab_header(
    title = "Trades by Month"
  )
Trades by Month
Month Count
January 14
February 425
March 1198
April 2107
May 6909
June 230
July 0
August 0
September 0
October 0
November 0
December 0
Show code

Kalshi25$DayOfWeek <- factor(Kalshi25$DayOfWeek, 
                             levels = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"), 
                             ordered = TRUE)

day_counts <- table(Kalshi25$DayOfWeek)

day_counts |>
  as.data.frame() |>
  setNames(c("Day", "Count")) |>
  gt() |>
  tab_header(
    title = "Trades by Day"
  )
Trades by Day
Day Count
Monday 873
Tuesday 1400
Wednesday 1729
Thursday 1228
Friday 1298
Saturday 1843
Sunday 2512

I already knew this, but the rate at which my trading has increased recently is exceptional. I traded more in May than in the four months prior combined (6,909 vs. 3,744).

Also, regarding weekly trends, I view my Kalshi trading rate as directly associated with how much free time and energy I have that day. Higher trading on weekends and slow Mondays both support this theory.
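As a quick check on the weekend pattern, the sketch below re-enters the counts from the "Trades by Day" table by hand and compares an average weekend day with an average weekday. The vector name is hypothetical.

```r
# Counts copied from the "Trades by Day" table above
counts_by_day <- c(Monday = 873, Tuesday = 1400, Wednesday = 1729,
                   Thursday = 1228, Friday = 1298,
                   Saturday = 1843, Sunday = 2512)

is_weekend <- names(counts_by_day) %in% c("Saturday", "Sunday")

mean(counts_by_day[is_weekend])   # 2177.5 rows per weekend day
mean(counts_by_day[!is_weekend])  # 1305.6 rows per weekday

# Share of all rows that fall on a weekend, in percent
round(100 * sum(counts_by_day[is_weekend]) / sum(counts_by_day), 1)  # 40
```

Two of the seven days account for about 40% of the activity, though Wednesday is busier than the free-time theory alone would predict.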

As for the profitability of these different times, there will be discussions of that later on.

Anyway, raw numbers are fine, but graphs are better.

Basic Graphs

Let’s use a graph to look at the frequency of trading “yes” vs “no” contracts.

Show code
ggplot(Kalshi25, aes(x = Direction)) +
  geom_bar(fill = "darkblue", alpha = 0.7, width = 0.6) +
  geom_text(
    stat = "count", 
    aes(label = after_stat(count)), 
    vjust = -0.5, 
    size = 5, 
    family = "Courier",
    color = "black"
  ) +
  labs(
    title = "Number of Trades by Direction",
    x = "Direction",
    y = "Count"
  ) +
  expand_limits(y = max(table(Kalshi25$Direction)) * 1.1) +
  theme_minimal(base_size = 15, base_family = "Courier") +
  theme(
    plot.background = element_rect(fill = "#FFFFFF", color = NA),
    panel.background = element_rect(fill = "#FFFFFF", color = NA),
    plot.title = element_text(hjust = 0.5, face = "bold"),
    axis.title = element_text(face = "bold"),
    axis.text = element_text(color = "black"),
    panel.grid.major.x = element_blank()
  )

It should be surprising that I trade “No” contracts more than “Yes.” “Yes” is usually the more intuitive/easier option. I mean, if you ask John Doe whether he thinks the Thunder or Pacers will win the NBA championship, he doesn’t respond with “The Pacers won’t win.” He says “The Thunder will win.”

However, there’s a reason for this. When I build arbitrages, I tend to build them with “No” orders since “No” orders tend to clear faster. This is because buying the “No” is the same as selling the “Yes”, and people are more inclined to buy “Yes” contracts than “No” contracts, by the aforementioned logic of “Yes” being more intuitive.
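The "buying No is the same as selling Yes" claim can be checked with a tiny payoff sketch. It assumes a binary contract that pays 100 cents to the winning side and ignores fees; the function names and the 40-cent price are made up for illustration.

```r
# Payoffs per contract, in cents. The winning side is paid 100.

# Buy "No" at no_price: lose the price if the event happens, else collect 100
payoff_buy_no <- function(no_price, event_happens) {
  if (event_happens) -no_price else 100 - no_price
}

# Sell "Yes" at yes_price: collect the price now, owe 100 if the event happens
payoff_sell_yes <- function(yes_price, event_happens) {
  if (event_happens) yes_price - 100 else yes_price
}

no_price  <- 40
yes_price <- 100 - no_price  # the matching "Yes" price

# The two positions pay the same amount in both outcomes
for (outcome in c(TRUE, FALSE)) {
  stopifnot(payoff_buy_no(no_price, outcome) == payoff_sell_yes(yes_price, outcome))
}
```

So a resting “No” order at 40 cents is filled by anyone willing to buy “Yes” at 60 cents, which is the crowd I expect to be larger.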

Click below to view mutation steps taken

Show code
# Baseball Arbitrages
baseball_df <- Kalshi25 %>%
  filter(str_starts(Ticker, "KXMLBGAME")) %>%
  filter(!is.na(Realized_Profit_Clean)) %>%
  mutate(
    GameID = str_replace(Ticker, "-[^-]+$", ""),
    Contracts = as.numeric(Contracts)  # Ensure numeric within pipeline
  )

baseball_collapsed <- baseball_df %>%
  group_by(GameID) %>%
  summarise(
    Ticker = first(GameID),
    Realized_Profit_Clean = sum(Realized_Profit_Clean, na.rm = TRUE),
    Contracts = sum(Contracts, na.rm = TRUE),
    Created_Parsed = min(Created_Parsed, na.rm = TRUE),
    .groups = "drop"
  )

non_baseball_df <- Kalshi25 %>%
  filter(!str_starts(Ticker, "KXMLBGAME"))

Kalshi25_arb <- bind_rows(
  non_baseball_df,
  baseball_collapsed
) %>%
  arrange(Created_Parsed)

# WNBA Arbitrages
womensbball <- Kalshi25_arb %>%
  filter(str_starts(Ticker, "KXWNBAGAME")) %>%
  filter(!is.na(Realized_Profit_Clean)) %>%
  mutate(
    GameID = str_replace(Ticker, "-[^-]+$", ""),
    Contracts = as.numeric(Contracts)
  )

womensbball_collapsed <- womensbball %>%
  group_by(GameID) %>%
  summarise(
    Ticker = first(GameID),
    Realized_Profit_Clean = sum(Realized_Profit_Clean, na.rm = TRUE),
    Contracts = sum(Contracts, na.rm = TRUE),
    Created_Parsed = min(Created_Parsed, na.rm = TRUE),
    .groups = "drop"
  )

non_womensbball <- Kalshi25_arb %>%
  filter(!str_starts(Ticker, "KXWNBAGAME"))

Kalshi25_arb <- bind_rows(
  non_womensbball,
  womensbball_collapsed
) %>%
  arrange(Created_Parsed)
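
The baseball and WNBA steps above follow the same pattern, so they could be folded into one helper. The sketch below is an illustration rather than the code I actually ran: `collapse_games()` is a hypothetical name, and the toy rows are invented to show the behavior.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: merge both sides of each game whose ticker starts
# with `prefix` into a single row, and leave all other rows untouched.
collapse_games <- function(df, prefix) {
  games <- df |>
    filter(str_starts(Ticker, prefix), !is.na(Realized_Profit_Clean)) |>
    mutate(
      GameID = str_replace(Ticker, "-[^-]+$", ""),  # drop the team suffix
      Contracts = as.numeric(Contracts)
    ) |>
    group_by(GameID) |>
    summarise(
      Ticker = first(GameID),
      Realized_Profit_Clean = sum(Realized_Profit_Clean, na.rm = TRUE),
      Contracts = sum(Contracts, na.rm = TRUE),
      Created_Parsed = min(Created_Parsed, na.rm = TRUE),
      .groups = "drop"
    )

  others <- df |> filter(!str_starts(Ticker, prefix))

  bind_rows(others, games) |> arrange(Created_Parsed)
}

# Invented toy rows: two sides of one baseball game plus an unrelated market
toy <- tibble(
  Ticker = c("KXMLBGAME-25MAY24PHIATH-PHI",
             "KXMLBGAME-25MAY24PHIATH-ATH",
             "KXGRAMBCS-67-TA"),
  Contracts = c(10L, 10L, 5L),
  Realized_Profit_Clean = c(4.00, -3.00, 2.50),
  Created_Parsed = as.POSIXct(
    c("2025-05-24 22:19:00", "2025-05-24 22:25:00", "2025-02-02 17:46:00"),
    tz = "UTC"
  )
)

toy_arb <- collapse_games(toy, "KXMLBGAME")
```

On the toy data, the two baseball rows become one row with a combined profit of $1.00. Applied to the real data, `Kalshi25 |> collapse_games("KXMLBGAME") |> collapse_games("KXWNBAGAME")` should reproduce `Kalshi25_arb`.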

Alright, take two. Let’s look at the biggest losses and wins now that we’ve accounted for arbitrages.

Show code
top_wins_no_arb <- Kalshi25_arb |>
  arrange(desc(Realized_Profit_Clean)) |>
  slice_head(n = 5)

top_losses_no_arb <- Kalshi25_arb |>
  arrange(Realized_Profit_Clean) |>
  slice_head(n = 5)

biggest_outliers_no_arb <- bind_rows(top_wins_no_arb, top_losses_no_arb)

biggest_outliers_no_arb |>
  select(Date = Created_Parsed, Profit = Realized_Profit_Clean, Ticker) %>%
  mutate(
    Date = format(Date, "%b %d, %Y %I:%M %p"),
    Profit = dollar(Profit)
  ) %>%
  kable(caption = "Top 5 Gains and Losses by Trade With Arbitrages Combined", align = "c") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Top 5 Gains and Losses by Trade With Arbitrages Combined
Date Profit Ticker
Feb 02, 2025 05:46 PM $268.80 KXGRAMBCS-67-TA
Mar 21, 2025 09:32 PM $181.14 KXPRESLEAVESK-25-APR
Apr 07, 2025 10:11 PM $165.20 KXMARMAD-25-HOU
Apr 25, 2025 12:40 AM $150.00 KXNBAGAME-25APR24OKCMEM-OKC
May 24, 2025 10:19 PM $109.21 KXMLBGAME-25MAY24PHIATH
Apr 04, 2025 11:52 AM -$286.66 KXPRESLEAVESK-25-MAY
Mar 23, 2025 11:19 AM -$181.83 KXAPPRANKFREE-25MAR23-NCA
May 21, 2025 09:23 PM -$172.18 KXTRUMPMENTION-25MAY21-REF
May 21, 2025 02:58 PM -$121.52 KXMLBGAME-25MAY21BALMIL
Apr 20, 2025 05:59 PM -$77.23 KXMLBGAME-25APR20SFLAA


Ok! We’re back in business. As a quick disclaimer, the “Ticker” variable is the ID for the market that the trade took place in. I know the ticker IDs can be difficult to understand without context, but I can use them to recall what these trades were.

Notably, my second-most-profitable trade was on the South Korean presidency market, for $181, on Mar 21. Not to be outdone, I quickly lost $286 two weeks later on the same market on Apr 04. Impressive.

It’s cool to see how the largest profits and losses come from markets where I would wait for resolution, as opposed to trading up and down. It makes sense. For example, sports markets are heavily botted and very efficient (i.e. impossible to market-make on) so I would just buy a position and hold (what’s that, gambling?) This is evidenced by 2 of the 5 most profitable trades being in basketball markets.

Lastly, my largest win came from a market with the ticker “KXGRAMBCS-67-TA”. After phoning a friend, I recalled that this was from the Grammy market. Funnily enough, my largest win came from a really boring and unimpressive strategy. All I did here was read a Rolling Stone article predicting the Grammy winners and notice that one of its predicted winners (“The Architect” for Best Country Song) was only given 4% odds on Kalshi. I trust Rolling Stone more than the Kalshi masses when it comes to the Grammys, so I threw $11 on it. From that, we get the $268 profit. Below is my Kalshi-generated receipt.

There are stories behind each of these trades (one would imagine so, since each of these numbers is a lot of money to lose or gain), but I won’t bore you.
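Going back to the Grammy trade for a moment, the numbers can be sanity-checked with a little arithmetic. This sketch assumes each winning contract pays $1, treats the $11 stake as exact (it is rounded in the text), and ignores fees.

```r
stake  <- 11.00   # approximate cost, from the text
profit <- 268.80  # realized profit, from the table above

payout        <- stake + profit  # dollars paid out; $1 per winning contract
implied_price <- stake / payout  # average price paid per contract, in dollars

round(payout, 2)               # 279.8
round(100 * implied_price, 1)  # 3.9 cents per contract
round(profit / stake, 1)       # 24.4 times the stake
```

An average price of about 3.9 cents lines up with the roughly 4% odds the market was showing.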

Now that we’ve combined arbitrages and discussed outliers, let’s move on past the outliers and zoom in on the scatterplot.

Show code
filtered_df <- Kalshi25_arb %>%
  filter(Created_Parsed >= as.POSIXct("2025-02-01", tz = "UTC"))  # parse_date_time() returns UTC by default

ggplot(filtered_df, aes(x = Created_Parsed, y = Realized_Profit_Clean)) +
  geom_hline(yintercept = 0, color = "gray30", linetype = "dashed", linewidth = 0.7) +
  geom_point(alpha = 0.6, color = "darkblue", size = 2) +
  scale_y_continuous(labels = dollar_format()) +
  coord_cartesian(ylim = c(-10, 10)) +
  labs(
    title = "Zoomed-In Profit by Individual Trades Over Time",
    x = "Date",
    y = "Realized Profit"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold"),
    axis.title = element_text(face = "bold"),
    axis.text = element_text(color = "black"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_line(color = "gray80"),
    panel.grid.major.y = element_line(color = "gray80")
  )

Much better than the previous scatterplot. The first thing that jumps out to me is how clustered these trades are. These vertical columns of trades show how heavily I relied on random profitable markets that would pop up and then go away. My trading was sporadic rather than daily and consistent. For example, the columns in late March are different March Madness games that I found profitable (and correctly so, as shown by the columns sitting largely above the $0 line).

Other things:

  1. Again, increased trading as time goes on.

  2. Soooooo many trades were net-neutral. The central axis is basically blue because of how many dots are consistently layered over one another.
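
The "net-neutral" observation could be quantified as the share of rows whose realized profit is zero. Here is a minimal sketch on invented numbers; `toy_profit` is not my data, and the half-cent tolerance is an arbitrary choice.

```r
# Invented realized profits, in dollars
toy_profit <- c(0, 0, 1.25, 0, -0.40, 0, 3.10, 0)

# Treat anything smaller than half a cent as zero
near_zero <- abs(toy_profit) < 0.005

mean(near_zero)  # 0.625, i.e. 5 of the 8 toy rows are net-neutral
```

The same idea on the real data would be `mean(abs(filtered_df$Realized_Profit_Clean) < 0.005, na.rm = TRUE)`. One likely contributor to the blue line: in the `glimpse()` output earlier, rows of type "trade" show $0.00 realized profit, so opening a position may register as a net-neutral row until it settles.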

What about looking at my best/worst markets?